Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 379 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 38.6 KiB |
| Average record size in memory | 104.3 B |
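The average record size can be cross-checked from the two figures above it. The following is a minimal sketch of that arithmetic, assuming that KiB means 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics table.
total_size_kib = 38.6
n_observations = 379
n_variables = 13
n_missing_cells = 0

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Share of missing cells over all cells (observations x variables).
missing_pct = 100 * n_missing_cells / (n_observations * n_variables)

print(f"{avg_record_bytes:.1f} B per record")
print(f"{missing_pct:.1f}% missing cells")
```

Because the total size is itself rounded to one decimal, the result should only be expected to agree with the table to about one decimal place.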
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 1 |
Warnings
| CRIM is highly correlated with RAD and 1 other field | High correlation |
|---|---|
| ZN is highly correlated with INDUS and 3 other fields | High correlation |
| INDUS is highly correlated with ZN and 6 other fields | High correlation |
| NOX is highly correlated with ZN and 6 other fields | High correlation |
| RM is highly correlated with LSTAT | High correlation |
| AGE is highly correlated with ZN and 5 other fields | High correlation |
| DIS is highly correlated with ZN and 5 other fields | High correlation |
| RAD is highly correlated with CRIM and 4 other fields | High correlation |
| TAX is highly correlated with CRIM and 6 other fields | High correlation |
| LSTAT is highly correlated with INDUS and 6 other fields | High correlation |
| CRIM is highly correlated with ZN and 7 other fields | High correlation |
| ZN is highly correlated with CRIM and 4 other fields | High correlation |
| INDUS is highly correlated with CRIM and 6 other fields | High correlation |
| NOX is highly correlated with CRIM and 7 other fields | High correlation |
| RM is highly correlated with LSTAT | High correlation |
| AGE is highly correlated with CRIM and 6 other fields | High correlation |
| DIS is highly correlated with CRIM and 6 other fields | High correlation |
| RAD is highly correlated with CRIM and 2 other fields | High correlation |
| TAX is highly correlated with CRIM and 6 other fields | High correlation |
| LSTAT is highly correlated with CRIM and 6 other fields | High correlation |
| CRIM is highly correlated with INDUS and 4 other fields | High correlation |
| ZN is highly correlated with INDUS and 1 other field | High correlation |
| INDUS is highly correlated with CRIM and 4 other fields | High correlation |
| NOX is highly correlated with CRIM and 4 other fields | High correlation |
| AGE is highly correlated with NOX and 1 other field | High correlation |
| DIS is highly correlated with CRIM and 3 other fields | High correlation |
| RAD is highly correlated with CRIM and 1 other field | High correlation |
| TAX is highly correlated with CRIM and 2 other fields | High correlation |
| LSTAT is highly correlated with INDUS and 8 other fields | High correlation |
| INDUS is highly correlated with LSTAT and 8 other fields | High correlation |
| ZN is highly correlated with INDUS and 6 other fields | High correlation |
| DIS is highly correlated with LSTAT and 7 other fields | High correlation |
| RAD is highly correlated with LSTAT and 7 other fields | High correlation |
| NOX is highly correlated with LSTAT and 8 other fields | High correlation |
| PTRATIO is highly correlated with LSTAT and 8 other fields | High correlation |
| B is highly correlated with LSTAT and 2 other fields | High correlation |
| AGE is highly correlated with LSTAT and 6 other fields | High correlation |
| RM is highly correlated with LSTAT and 3 other fields | High correlation |
| CRIM is highly correlated with LSTAT and 4 other fields | High correlation |
| TAX is highly correlated with INDUS and 5 other fields | High correlation |
| ZN has 280 (73.9%) zeros | Zeros |
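The zeros figure for ZN is a simple share of the 379 observations. A minimal sketch of how such a count and percentage could be derived is shown here; the helper `zero_share` and the toy column are hypothetical, and the threshold the profiling tool uses to raise the warning is not stated in this report.

```python
def zero_share(values):
    """Return (count, percentage) of entries that are exactly zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100 * zeros / len(values)

# Hypothetical column shaped like ZN: 280 zeros among 379 observations.
zn_like = [0.0] * 280 + [12.5] * 99

count, pct = zero_share(zn_like)
print(f"{count} ({pct:.1f}%) zeros")
```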
Reproduction
| Analysis started | 2021-06-04 16:29:39.037778 |
|---|---|
| Analysis finished | 2021-06-04 16:29:49.974112 |
| Duration | 10.94 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
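The reported duration is the difference between the two timestamps above. As a small sketch using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-06-04 16:29:39.037778", FMT)
finished = datetime.strptime("2021-06-04 16:29:49.974112", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```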
CRIM
Numeric
| Distinct | 377 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.61310628 |
| Minimum | 0.01301 |
|---|---|
| Maximum | 88.9762 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 0.01301 |
|---|---|
| 5-th percentile | 0.03034 |
| Q1 | 0.082325 |
| median | 0.25356 |
| Q3 | 3.630895 |
| 95-th percentile | 15.60416 |
| Maximum | 88.9762 |
| Range | 88.96319 |
| Interquartile range (IQR) | 3.54857 |
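Range and IQR are differences of the quantiles listed above. The sketch below shows one common way to compute quantiles, using linear interpolation between neighbouring order statistics. That interpolation rule is an assumption; the report does not say which method was used, and the function names are illustrative.

```python
def quantile(values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)          # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])

def spread(values):
    """Range and interquartile range of a sequence of numbers."""
    q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
    return {"range": max(values) - min(values), "iqr": q3 - q1}

# Cross-check using the CRIM quantiles reported in the table.
crim_range = 88.9762 - 0.01301     # Maximum - Minimum
crim_iqr = 3.630895 - 0.082325     # Q3 - Q1
print(crim_range, crim_iqr)
```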
Descriptive statistics
| Standard deviation | 9.010515446 |
|---|---|
| Coefficient of variation (CV) | 2.493841794 |
| Kurtosis | 38.89161158 |
| Mean | 3.61310628 |
| Median Absolute Deviation (MAD) | 0.21588 |
| Skewness | 5.449813532 |
| Sum | 1369.36728 |
| Variance | 81.18938859 |
| Monotonicity | Not monotonic |
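Several of the descriptive statistics are related to one another, which gives a quick consistency check. The sketch below assumes the sample standard deviation (n - 1 in the denominator) and a MAD defined as the median of absolute deviations from the median; neither convention is confirmed by the report, and the helper names are illustrative.

```python
from statistics import mean, median, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def median_abs_deviation(values):
    """Median of the absolute deviations from the median."""
    m = median(values)
    return median([abs(v - m) for v in values])

def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

# Consistency check on the CRIM figures reported above.
reported_mean = 3.61310628
reported_std = 9.010515446
print(reported_std / reported_mean)  # compare with the reported CV
print(reported_std ** 2)             # compare with the reported variance
```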
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 14.3337 | 2 | 0.5% |
| 0.01501 | 2 | 0.5% |
| 0.57834 | 1 | 0.3% |
| 0.03705 | 1 | 0.3% |
| 5.73116 | 1 | 0.3% |
| 13.3598 | 1 | 0.3% |
| 0.06617 | 1 | 0.3% |
| 0.21124 | 1 | 0.3% |
| 4.89822 | 1 | 0.3% |
| 0.25199 | 1 | 0.3% |
| Other values (367) | 367 |
| Value | Count | Frequency (%) |
| 0.01301 | 1 | |
| 0.01381 | 1 | |
| 0.01432 | 1 | |
| 0.01439 | 1 | |
| 0.01501 | 2 | |
| 0.01538 | 1 | |
| 0.01709 | 1 | |
| 0.0187 | 1 | |
| 0.01951 | 1 | |
| 0.02009 | 1 |
| Value | Count | Frequency (%) |
| 88.9762 | 1 | |
| 73.5341 | 1 | |
| 67.9208 | 1 | |
| 45.7461 | 1 | |
| 41.5292 | 1 | |
| 38.3518 | 1 | |
| 28.6558 | 1 | |
| 25.9406 | 1 | |
| 25.0461 | 1 | |
| 24.8017 | 1 |
ZN
Numeric
| Distinct | 25 |
|---|---|
| Distinct (%) | 6.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.75725594 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 280 |
| Zeros (%) | 73.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 12.5 |
| 95-th percentile | 80 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 12.5 |
Descriptive statistics
| Standard deviation | 22.41265631 |
|---|---|
| Coefficient of variation (CV) | 2.083491965 |
| Kurtosis | 4.517109163 |
| Mean | 10.75725594 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.310886144 |
| Sum | 4077 |
| Variance | 502.3271628 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=25)
| Value | Count | Frequency (%) |
| 0 | 280 | 73.9% |
| 20 | 15 | 4.0% |
| 80 | 12 | 3.2% |
| 22 | 9 | 2.4% |
| 12.5 | 8 | 2.1% |
| 25 | 8 | 2.1% |
| 40 | 5 | 1.3% |
| 30 | 5 | 1.3% |
| 21 | 4 | 1.1% |
| 33 | 4 | 1.1% |
| Other values (15) | 29 | 7.7% |
| Value | Count | Frequency (%) |
| 0 | 280 | 73.9% |
| 12.5 | 8 | 2.1% |
| 17.5 | 1 | 0.3% |
| 20 | 15 | 4.0% |
| 21 | 4 | 1.1% |
| 22 | 9 | 2.4% |
| 25 | 8 | 2.1% |
| 28 | 2 | 0.5% |
| 30 | 5 | 1.3% |
| 33 | 4 | 1.1% |
| Value | Count | Frequency (%) |
| 100 | 1 | 0.3% |
| 95 | 1 | 0.3% |
| 90 | 3 | 0.8% |
| 85 | 2 | 0.5% |
| 82.5 | 1 | 0.3% |
| 80 | 12 | 3.2% |
| 75 | 2 | 0.5% |
| 70 | 3 | 0.8% |
| 60 | 3 | 0.8% |
| 55 | 1 | 0.3% |
INDUS
Numeric
| Distinct | 68 |
|---|---|
| Distinct (%) | 17.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.16751979 |
| Minimum | 0.46 |
|---|---|
| Maximum | 27.74 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 0.46 |
|---|---|
| 5-th percentile | 2.18 |
| Q1 | 5.19 |
| median | 9.69 |
| Q3 | 18.1 |
| 95-th percentile | 21.89 |
| Maximum | 27.74 |
| Range | 27.28 |
| Interquartile range (IQR) | 12.91 |
Descriptive statistics
| Standard deviation | 6.875301296 |
|---|---|
| Coefficient of variation (CV) | 0.6156515883 |
| Kurtosis | -1.17240558 |
| Mean | 11.16751979 |
| Median Absolute Deviation (MAD) | 6.28 |
| Skewness | 0.3226806286 |
| Sum | 4232.49 |
| Variance | 47.26976791 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 18.1 | 96 | |
| 19.58 | 23 | 6.1% |
| 8.14 | 15 | 4.0% |
| 6.2 | 15 | 4.0% |
| 21.89 | 11 | 2.9% |
| 9.9 | 9 | 2.4% |
| 3.97 | 9 | 2.4% |
| 5.86 | 9 | 2.4% |
| 10.59 | 8 | 2.1% |
| 8.56 | 8 | 2.1% |
| Other values (58) | 176 |
| Value | Count | Frequency (%) |
| 0.46 | 1 | 0.3% |
| 0.74 | 1 | 0.3% |
| 1.21 | 1 | 0.3% |
| 1.25 | 2 | |
| 1.32 | 1 | 0.3% |
| 1.38 | 1 | 0.3% |
| 1.52 | 3 | |
| 1.69 | 1 | 0.3% |
| 1.89 | 1 | 0.3% |
| 1.91 | 2 |
| Value | Count | Frequency (%) |
| 27.74 | 5 | 1.3% |
| 25.65 | 5 | 1.3% |
| 21.89 | 11 | 2.9% |
| 19.58 | 23 | 6.1% |
| 18.1 | 96 | |
| 15.04 | 2 | 0.5% |
| 13.92 | 5 | 1.3% |
| 13.89 | 3 | 0.8% |
| 12.83 | 5 | 1.3% |
| 11.93 | 4 | 1.1% |
CHAS
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.1 KiB |
| 0.0 | 356 |
|---|---|
| 1.0 | 23 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1137 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
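The character and category counts in this section can be reproduced with the standard library. This is a sketch under the assumption that CHAS was profiled as the strings "0.0" and "1.0" (consistent with a length of 3 for every value); the list below is a hypothetical reconstruction, not the original data.

```python
import unicodedata
from collections import Counter

# Hypothetical reconstruction of CHAS as text: 356 "0.0" values and 23 "1.0" values.
values = ["0.0"] * 356 + ["1.0"] * 23
text = "".join(values)

# Count individual characters and their Unicode general categories.
char_counts = Counter(text)
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(len(text))         # total characters
print(char_counts)       # counts per character
print(category_counts)   # "Nd" = Decimal Number, "Po" = Other Punctuation
```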
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 356 | 93.9% |
| 1.0 | 23 | 6.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0.0 | 356 | 93.9% |
| 1.0 | 23 | 6.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 735 | |
| . | 379 | |
| 1 | 23 | 2.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 758 | |
| Other Punctuation | 379 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 735 | |
| 1 | 23 | 3.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 379 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1137 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 735 | |
| . | 379 | |
| 1 | 23 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1137 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 735 | |
| . | 379 | |
| 1 | 23 | 2.0% |
NOX
Numeric
| Distinct | 78 |
|---|---|
| Distinct (%) | 20.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5547596306 |
| Minimum | 0.392 |
|---|---|
| Maximum | 0.871 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 0.392 |
|---|---|
| 5-th percentile | 0.4109 |
| Q1 | 0.453 |
| median | 0.538 |
| Q3 | 0.624 |
| 95-th percentile | 0.743 |
| Maximum | 0.871 |
| Range | 0.479 |
| Interquartile range (IQR) | 0.171 |
Descriptive statistics
| Standard deviation | 0.1156828559 |
|---|---|
| Coefficient of variation (CV) | 0.2085278913 |
| Kurtosis | -0.05337746777 |
| Mean | 0.5547596306 |
| Median Absolute Deviation (MAD) | 0.086 |
| Skewness | 0.7427763672 |
| Sum | 210.2539 |
| Variance | 0.01338252315 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0.538 | 15 | 4.0% |
| 0.713 | 15 | 4.0% |
| 0.693 | 14 | 3.7% |
| 0.489 | 12 | 3.2% |
| 0.437 | 12 | 3.2% |
| 0.871 | 12 | 3.2% |
| 0.605 | 11 | 2.9% |
| 0.624 | 11 | 2.9% |
| 0.507 | 10 | 2.6% |
| 0.431 | 9 | 2.4% |
| Other values (68) | 258 |
| Value | Count | Frequency (%) |
| 0.392 | 2 | 0.5% |
| 0.394 | 1 | 0.3% |
| 0.398 | 1 | 0.3% |
| 0.4 | 3 | |
| 0.401 | 3 | |
| 0.404 | 2 | 0.5% |
| 0.405 | 2 | 0.5% |
| 0.409 | 3 | |
| 0.41 | 2 | 0.5% |
| 0.411 | 5 |
| Value | Count | Frequency (%) |
| 0.871 | 12 | |
| 0.77 | 7 | |
| 0.74 | 6 | 1.6% |
| 0.718 | 5 | 1.3% |
| 0.713 | 15 | |
| 0.7 | 8 | |
| 0.693 | 14 | |
| 0.679 | 6 | 1.6% |
| 0.671 | 5 | 1.3% |
| 0.668 | 1 | 0.3% |
RM
Numeric
| Distinct | 349 |
|---|---|
| Distinct (%) | 92.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.283593668 |
| Minimum | 3.561 |
|---|---|
| Maximum | 8.725 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 3.561 |
|---|---|
| 5-th percentile | 5.2603 |
| Q1 | 5.89 |
| median | 6.195 |
| Q3 | 6.6185 |
| 95-th percentile | 7.6176 |
| Maximum | 8.725 |
| Range | 5.164 |
| Interquartile range (IQR) | 0.7285 |
Descriptive statistics
| Standard deviation | 0.7137077296 |
|---|---|
| Coefficient of variation (CV) | 0.1135827311 |
| Kurtosis | 1.890679648 |
| Mean | 6.283593668 |
| Median Absolute Deviation (MAD) | 0.336 |
| Skewness | 0.3826854739 |
| Sum | 2381.482 |
| Variance | 0.5093787233 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 6.229 | 3 | 0.8% |
| 6.167 | 3 | 0.8% |
| 6.782 | 2 | 0.5% |
| 5.854 | 2 | 0.5% |
| 5.936 | 2 | 0.5% |
| 6.794 | 2 | 0.5% |
| 6.144 | 2 | 0.5% |
| 6.127 | 2 | 0.5% |
| 5.304 | 2 | 0.5% |
| 6.968 | 2 | 0.5% |
| Other values (339) | 357 |
| Value | Count | Frequency (%) |
| 3.561 | 1 | |
| 3.863 | 1 | |
| 4.138 | 1 | |
| 4.519 | 1 | |
| 4.628 | 1 | |
| 4.652 | 1 | |
| 4.88 | 1 | |
| 4.903 | 1 | |
| 4.906 | 1 | |
| 4.926 | 1 |
| Value | Count | Frequency (%) |
| 8.725 | 1 | |
| 8.704 | 1 | |
| 8.398 | 1 | |
| 8.375 | 1 | |
| 8.337 | 1 | |
| 8.297 | 1 | |
| 8.266 | 1 | |
| 8.259 | 1 | |
| 8.247 | 1 | |
| 8.069 | 1 |
AGE
Numeric
| Distinct | 287 |
|---|---|
| Distinct (%) | 75.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 68.5883905 |
| Minimum | 6 |
|---|---|
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 17.68 |
| Q1 | 43.9 |
| median | 77.3 |
| Q3 | 93.7 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 94 |
| Interquartile range (IQR) | 49.8 |
Descriptive statistics
| Standard deviation | 28.20050794 |
|---|---|
| Coefficient of variation (CV) | 0.4111557033 |
| Kurtosis | -0.948976521 |
| Mean | 68.5883905 |
| Median Absolute Deviation (MAD) | 18.9 |
| Skewness | -0.6236126665 |
| Sum | 25995 |
| Variance | 795.2686479 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 33 | 8.7% |
| 96 | 4 | 1.1% |
| 87.9 | 4 | 1.1% |
| 98.8 | 4 | 1.1% |
| 97.4 | 3 | 0.8% |
| 95.6 | 3 | 0.8% |
| 97 | 3 | 0.8% |
| 95.4 | 3 | 0.8% |
| 88 | 3 | 0.8% |
| 76.5 | 3 | 0.8% |
| Other values (277) | 316 |
| Value | Count | Frequency (%) |
| 6 | 1 | |
| 6.2 | 1 | |
| 6.5 | 1 | |
| 6.6 | 1 | |
| 6.8 | 1 | |
| 7.8 | 1 | |
| 8.4 | 1 | |
| 8.9 | 1 | |
| 9.8 | 1 | |
| 9.9 | 1 |
| Value | Count | Frequency (%) |
| 100 | 33 | |
| 99.3 | 1 | 0.3% |
| 99.1 | 1 | 0.3% |
| 98.9 | 2 | 0.5% |
| 98.8 | 4 | 1.1% |
| 98.7 | 1 | 0.3% |
| 98.5 | 1 | 0.3% |
| 98.4 | 1 | 0.3% |
| 98.3 | 1 | 0.3% |
| 98.2 | 2 | 0.5% |
DIS
Numeric
| Distinct | 321 |
|---|---|
| Distinct (%) | 84.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.776123747 |
| Minimum | 1.1691 |
|---|---|
| Maximum | 12.1265 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 1.1691 |
|---|---|
| 5-th percentile | 1.48736 |
| Q1 | 2.10035 |
| median | 3.1025 |
| Q3 | 5.1167 |
| 95-th percentile | 7.9549 |
| Maximum | 12.1265 |
| Range | 10.9574 |
| Interquartile range (IQR) | 3.01635 |
Descriptive statistics
| Standard deviation | 2.106978124 |
|---|---|
| Coefficient of variation (CV) | 0.5579738021 |
| Kurtosis | 0.7081456303 |
| Mean | 3.776123747 |
| Median Absolute Deviation (MAD) | 1.2524 |
| Skewness | 1.074516209 |
| Sum | 1431.1509 |
| Variance | 4.439356815 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 6.8147 | 4 | 1.1% |
| 3.4952 | 4 | 1.1% |
| 5.1167 | 3 | 0.8% |
| 4.7211 | 3 | 0.8% |
| 6.3361 | 3 | 0.8% |
| 5.4007 | 3 | 0.8% |
| 6.498 | 3 | 0.8% |
| 4.8122 | 3 | 0.8% |
| 3.9454 | 3 | 0.8% |
| 3.6519 | 3 | 0.8% |
| Other values (311) | 347 |
| Value | Count | Frequency (%) |
| 1.1691 | 1 | |
| 1.1742 | 1 | |
| 1.1781 | 1 | |
| 1.2024 | 1 | |
| 1.3216 | 1 | |
| 1.3325 | 1 | |
| 1.3459 | 1 | |
| 1.358 | 1 | |
| 1.3861 | 2 | |
| 1.4118 | 1 |
| Value | Count | Frequency (%) |
| 12.1265 | 1 | |
| 10.7103 | 1 | |
| 10.5857 | 2 | |
| 9.2229 | 1 | |
| 9.2203 | 2 | |
| 9.1876 | 1 | |
| 8.9067 | 2 | |
| 8.7921 | 2 | |
| 8.5353 | 1 | |
| 8.344 | 1 |
RAD
Numeric
| Distinct | 9 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.461741425 |
| Minimum | 1 |
|---|---|
| Maximum | 24 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4 |
| median | 5 |
| Q3 | 24 |
| 95-th percentile | 24 |
| Maximum | 24 |
| Range | 23 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 8.599279417 |
|---|---|
| Coefficient of variation (CV) | 0.9088474341 |
| Kurtosis | -0.7757843183 |
| Mean | 9.461741425 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.044154395 |
| Sum | 3586 |
| Variance | 73.94760648 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=9)
| Value | Count | Frequency (%) |
| 24 | 96 | |
| 4 | 85 | |
| 5 | 84 | |
| 3 | 24 | 6.3% |
| 6 | 21 | 5.5% |
| 8 | 20 | 5.3% |
| 2 | 19 | 5.0% |
| 7 | 16 | 4.2% |
| 1 | 14 | 3.7% |
| Value | Count | Frequency (%) |
| 1 | 14 | 3.7% |
| 2 | 19 | 5.0% |
| 3 | 24 | 6.3% |
| 4 | 85 | |
| 5 | 84 | |
| 6 | 21 | 5.5% |
| 7 | 16 | 4.2% |
| 8 | 20 | 5.3% |
| 24 | 96 |
| Value | Count | Frequency (%) |
| 24 | 96 | |
| 8 | 20 | 5.3% |
| 7 | 16 | 4.2% |
| 6 | 21 | 5.5% |
| 5 | 84 | |
| 4 | 85 | |
| 3 | 24 | 6.3% |
| 2 | 19 | 5.0% |
| 1 | 14 | 3.7% |
TAX
Numeric
| Distinct | 60 |
|---|---|
| Distinct (%) | 15.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 406.2823219 |
| Minimum | 187 |
|---|---|
| Maximum | 711 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 187 |
|---|---|
| 5-th percentile | 222 |
| Q1 | 279 |
| median | 330 |
| Q3 | 666 |
| 95-th percentile | 666 |
| Maximum | 711 |
| Range | 524 |
| Interquartile range (IQR) | 387 |
Descriptive statistics
| Standard deviation | 168.2674301 |
|---|---|
| Coefficient of variation (CV) | 0.4141637994 |
| Kurtosis | -1.102595217 |
| Mean | 406.2823219 |
| Median Absolute Deviation (MAD) | 73 |
| Skewness | 0.6999476326 |
| Sum | 153981 |
| Variance | 28313.92802 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 666 | 96 | |
| 307 | 30 | 7.9% |
| 403 | 23 | 6.1% |
| 437 | 11 | 2.9% |
| 304 | 10 | 2.6% |
| 264 | 9 | 2.4% |
| 330 | 9 | 2.4% |
| 277 | 8 | 2.1% |
| 432 | 8 | 2.1% |
| 384 | 8 | 2.1% |
| Other values (50) | 167 |
| Value | Count | Frequency (%) |
| 187 | 1 | 0.3% |
| 188 | 5 | |
| 193 | 6 | |
| 198 | 1 | 0.3% |
| 216 | 3 | |
| 222 | 6 | |
| 223 | 4 | |
| 224 | 7 | |
| 233 | 5 | |
| 242 | 2 | 0.5% |
| Value | Count | Frequency (%) |
| 711 | 5 | 1.3% |
| 666 | 96 | |
| 437 | 11 | 2.9% |
| 432 | 8 | 2.1% |
| 430 | 3 | 0.8% |
| 422 | 1 | 0.3% |
| 411 | 1 | 0.3% |
| 403 | 23 | 6.1% |
| 398 | 7 | 1.8% |
| 391 | 6 | 1.6% |
PTRATIO
Numeric
| Distinct | 43 |
|---|---|
| Distinct (%) | 11.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.45540897 |
| Minimum | 12.6 |
|---|---|
| Maximum | 22 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 12.6 |
|---|---|
| 5-th percentile | 14.7 |
| Q1 | 17.4 |
| median | 19 |
| Q3 | 20.2 |
| 95-th percentile | 21 |
| Maximum | 22 |
| Range | 9.4 |
| Interquartile range (IQR) | 2.8 |
Descriptive statistics
| Standard deviation | 2.140140684 |
|---|---|
| Coefficient of variation (CV) | 0.1159627883 |
| Kurtosis | -0.2231730165 |
| Mean | 18.45540897 |
| Median Absolute Deviation (MAD) | 1.2 |
| Skewness | -0.8049313636 |
| Sum | 6994.6 |
| Variance | 4.580202147 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=43)
| Value | Count | Frequency (%) |
| 20.2 | 102 | |
| 14.7 | 25 | 6.6% |
| 17.8 | 20 | 5.3% |
| 21 | 19 | 5.0% |
| 17.4 | 15 | 4.0% |
| 19.2 | 15 | 4.0% |
| 19.1 | 14 | 3.7% |
| 16.6 | 13 | 3.4% |
| 18.6 | 13 | 3.4% |
| 18.4 | 13 | 3.4% |
| Other values (33) | 130 |
| Value | Count | Frequency (%) |
| 12.6 | 2 | 0.5% |
| 13 | 9 | 2.4% |
| 13.6 | 1 | 0.3% |
| 14.4 | 1 | 0.3% |
| 14.7 | 25 | |
| 14.8 | 3 | 0.8% |
| 14.9 | 2 | 0.5% |
| 15.1 | 1 | 0.3% |
| 15.2 | 7 | 1.8% |
| 15.5 | 1 | 0.3% |
| Value | Count | Frequency (%) |
| 22 | 2 | 0.5% |
| 21.2 | 11 | 2.9% |
| 21 | 19 | 5.0% |
| 20.9 | 8 | 2.1% |
| 20.2 | 102 | |
| 20.1 | 5 | 1.3% |
| 19.7 | 7 | 1.8% |
| 19.6 | 6 | 1.6% |
| 19.2 | 15 | 4.0% |
| 19.1 | 14 | 3.7% |
B
Numeric
| Distinct | 273 |
|---|---|
| Distinct (%) | 72.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 357.7154881 |
| Minimum | 0.32 |
|---|---|
| Maximum | 396.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 0.32 |
|---|---|
| 5-th percentile | 87.788 |
| Q1 | 376.715 |
| median | 392.18 |
| Q3 | 396.22 |
| 95-th percentile | 396.9 |
| Maximum | 396.9 |
| Range | 396.58 |
| Interquartile range (IQR) | 19.505 |
Descriptive statistics
| Standard deviation | 91.54343339 |
|---|---|
| Coefficient of variation (CV) | 0.2559112938 |
| Kurtosis | 7.275504168 |
| Mean | 357.7154881 |
| Median Absolute Deviation (MAD) | 4.72 |
| Skewness | -2.906162192 |
| Sum | 135574.17 |
| Variance | 8380.200196 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 396.9 | 90 | 23.7% |
| 395.6 | 2 | 0.5% |
| 395.69 | 2 | 0.5% |
| 395.63 | 2 | 0.5% |
| 395.56 | 2 | 0.5% |
| 395.24 | 2 | 0.5% |
| 393.68 | 2 | 0.5% |
| 396.06 | 2 | 0.5% |
| 391.34 | 2 | 0.5% |
| 393.23 | 2 | 0.5% |
| Other values (263) | 271 |
| Value | Count | Frequency (%) |
| 0.32 | 1 | |
| 2.52 | 1 | |
| 3.5 | 1 | |
| 3.65 | 1 | |
| 6.68 | 1 | |
| 7.68 | 1 | |
| 10.48 | 1 | |
| 16.45 | 1 | |
| 21.57 | 1 | |
| 22.01 | 1 |
| Value | Count | Frequency (%) |
| 396.9 | 90 | |
| 396.42 | 1 | 0.3% |
| 396.3 | 1 | 0.3% |
| 396.28 | 1 | 0.3% |
| 396.24 | 1 | 0.3% |
| 396.23 | 1 | 0.3% |
| 396.21 | 2 | 0.5% |
| 396.14 | 1 | 0.3% |
| 396.06 | 2 | 0.5% |
| 395.99 | 1 | 0.3% |
LSTAT
Numeric
| Distinct | 356 |
|---|---|
| Distinct (%) | 93.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.77015831 |
| Minimum | 1.73 |
|---|---|
| Maximum | 36.98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.1 KiB |
Quantile statistics
| Minimum | 1.73 |
|---|---|
| 5-th percentile | 3.588 |
| Q1 | 7.13 |
| median | 11.45 |
| Q3 | 17.115 |
| 95-th percentile | 27.272 |
| Maximum | 36.98 |
| Range | 35.25 |
| Interquartile range (IQR) | 9.985 |
Descriptive statistics
| Standard deviation | 7.182040098 |
|---|---|
| Coefficient of variation (CV) | 0.5624080707 |
| Kurtosis | 0.4149520109 |
| Mean | 12.77015831 |
| Median Absolute Deviation (MAD) | 4.83 |
| Skewness | 0.8993281776 |
| Sum | 4839.89 |
| Variance | 51.58169997 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 8.05 | 3 | 0.8% |
| 18.13 | 3 | 0.8% |
| 15.17 | 2 | 0.5% |
| 7.44 | 2 | 0.5% |
| 8.1 | 2 | 0.5% |
| 6.36 | 2 | 0.5% |
| 12.43 | 2 | 0.5% |
| 9.97 | 2 | 0.5% |
| 5.98 | 2 | 0.5% |
| 3.95 | 2 | 0.5% |
| Other values (346) | 357 |
| Value | Count | Frequency (%) |
| 1.73 | 1 | |
| 1.98 | 1 | |
| 2.47 | 1 | |
| 2.87 | 1 | |
| 2.88 | 1 | |
| 2.94 | 1 | |
| 2.96 | 1 | |
| 2.97 | 1 | |
| 3.11 | 2 | |
| 3.16 | 2 |
| Value | Count | Frequency (%) |
| 36.98 | 1 | |
| 34.77 | 1 | |
| 34.41 | 1 | |
| 34.37 | 1 | |
| 34.02 | 1 | |
| 31.99 | 1 | |
| 30.81 | 1 | |
| 30.62 | 1 | |
| 30.59 | 1 | |
| 29.97 | 1 |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
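The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy data are illustrative only; this is not the code the profiling tool runs.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # The common 1/n factors cancel between numerator and denominator.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(x, [2 * v + 1 for v in x]))     # gentle positive slope
print(pearson_r(x, [100 * v - 7 for v in x]))   # steep positive slope, same r
print(pearson_r(x, [-v for v in x]))            # negative slope
```

The first two calls differ only in location and scale of the second variable, which illustrates the invariance mentioned in the text.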
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
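As a sketch of the rank-based calculation, the example below ranks each variable (giving tied values their average rank, which is an assumption about tie handling) and then applies the Pearson formula to the ranks. All names are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]      # monotonic but not linear
print(spearman_rho(x, y))    # rank correlation
print(pearson_r(x, y))       # linear correlation, lower than the rank correlation
```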
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
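The pair-counting definition above corresponds to the variant often called tau-a. The sketch below implements exactly that definition; it is an illustration, and statistical libraries commonly compute a tie-corrected variant (tau-b) instead, so results may differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))   # two concordant pairs, one discordant
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))   # every pair discordant
```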
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
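To make the idea of a nullity matrix concrete, here is a minimal text-only sketch. The toy records are hypothetical (the profiled dataset itself reports no missing cells), and the rendering with `#` and `.` merely stands in for the graphical display.

```python
# Hypothetical toy records with a few missing entries (None).
rows = [
    {"CRIM": 0.04, "ZN": 0.0, "CHAS": 0.0},
    {"CRIM": None, "ZN": 12.5, "CHAS": 1.0},
    {"CRIM": 3.32, "ZN": None, "CHAS": 0.0},
]
columns = ["CRIM", "ZN", "CHAS"]

# Nullity matrix: True where a cell is missing, one row per record.
nullity = [[row[c] is None for c in columns] for row in rows]

# Per-column missing counts, the basis of a nullity-by-column bar chart.
missing_per_column = {
    c: sum(r[i] for r in nullity) for i, c in enumerate(columns)
}

# Text rendering: '#' for a present cell, '.' for a missing one.
for r in nullity:
    print("".join("." if missing else "#" for missing in r))
print(missing_per_column)
```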
First rows
| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.03961 | 0.0 | 5.19 | 0.0 | 0.515 | 6.037 | 34.5 | 5.9853 | 5.0 | 224.0 | 20.2 | 396.90 | 8.01 |
| 1 | 3.32105 | 0.0 | 19.58 | 1.0 | 0.871 | 5.403 | 100.0 | 1.3216 | 5.0 | 403.0 | 14.7 | 396.90 | 26.82 |
| 2 | 1.20742 | 0.0 | 19.58 | 0.0 | 0.605 | 5.875 | 94.6 | 2.4259 | 5.0 | 403.0 | 14.7 | 292.29 | 14.43 |
| 3 | 0.10612 | 30.0 | 4.93 | 0.0 | 0.428 | 6.095 | 65.1 | 6.3361 | 6.0 | 300.0 | 16.6 | 394.62 | 12.40 |
| 4 | 17.86670 | 0.0 | 18.10 | 0.0 | 0.671 | 6.223 | 100.0 | 1.3861 | 24.0 | 666.0 | 20.2 | 393.74 | 21.78 |
| 5 | 13.52220 | 0.0 | 18.10 | 0.0 | 0.631 | 3.863 | 100.0 | 1.5106 | 24.0 | 666.0 | 20.2 | 131.42 | 13.33 |
| 6 | 14.33370 | 0.0 | 18.10 | 0.0 | 0.700 | 4.880 | 100.0 | 1.5895 | 24.0 | 666.0 | 20.2 | 372.92 | 30.62 |
| 7 | 2.44668 | 0.0 | 19.58 | 0.0 | 0.871 | 5.272 | 94.0 | 1.7364 | 5.0 | 403.0 | 14.7 | 88.63 | 16.14 |
| 8 | 5.58107 | 0.0 | 18.10 | 0.0 | 0.713 | 6.436 | 87.9 | 2.3158 | 24.0 | 666.0 | 20.2 | 100.19 | 16.22 |
| 9 | 1.35472 | 0.0 | 8.14 | 0.0 | 0.538 | 6.072 | 100.0 | 4.1750 | 4.0 | 307.0 | 21.0 | 376.73 | 13.04 |
Last rows
| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 369 | 0.15038 | 0.0 | 25.65 | 0.0 | 0.581 | 5.856 | 97.0 | 1.9444 | 2.0 | 188.0 | 19.1 | 370.31 | 25.41 |
| 370 | 0.17120 | 0.0 | 8.56 | 0.0 | 0.520 | 5.836 | 91.9 | 2.2110 | 5.0 | 384.0 | 20.9 | 395.67 | 18.66 |
| 371 | 0.03551 | 25.0 | 4.86 | 0.0 | 0.426 | 6.167 | 46.7 | 5.4007 | 4.0 | 281.0 | 19.0 | 390.64 | 7.51 |
| 372 | 0.78420 | 0.0 | 8.14 | 0.0 | 0.538 | 5.990 | 81.7 | 4.2579 | 4.0 | 307.0 | 21.0 | 386.75 | 14.67 |
| 373 | 0.53700 | 0.0 | 6.20 | 0.0 | 0.504 | 5.981 | 68.1 | 3.6715 | 8.0 | 307.0 | 17.4 | 378.35 | 11.65 |
| 374 | 0.08187 | 0.0 | 2.89 | 0.0 | 0.445 | 7.820 | 36.9 | 3.4952 | 2.0 | 276.0 | 18.0 | 393.53 | 3.57 |
| 375 | 4.87141 | 0.0 | 18.10 | 0.0 | 0.614 | 6.484 | 93.6 | 2.3053 | 24.0 | 666.0 | 20.2 | 396.21 | 18.68 |
| 376 | 0.35114 | 0.0 | 7.38 | 0.0 | 0.493 | 6.041 | 49.9 | 4.7211 | 5.0 | 287.0 | 19.6 | 396.90 | 7.70 |
| 377 | 9.18702 | 0.0 | 18.10 | 0.0 | 0.700 | 5.536 | 100.0 | 1.5804 | 24.0 | 666.0 | 20.2 | 396.90 | 23.60 |
| 378 | 4.55587 | 0.0 | 18.10 | 0.0 | 0.718 | 3.561 | 87.9 | 1.6132 | 24.0 | 666.0 | 20.2 | 354.70 | 7.12 |